Regularized estimation of large covariance matrices
Authors
Abstract
This paper considers estimating a covariance matrix of p variables from n observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as (log p)/n → 0, and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of the eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
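As a rough illustration of the banding and tapering operations described in the abstract, the following NumPy sketch zeroes out (banding) or linearly down-weights (tapering) sample-covariance entries lying more than k positions from the diagonal. The function names, the particular linear taper weights, and the simulated data are illustrative assumptions, not the paper's exact construction; in practice the banding parameter k would be chosen by a data-driven rule such as the resampling scheme mentioned above.

```python
import numpy as np

def band_covariance(S, k):
    """Banding: zero out entries of S more than k positions off the diagonal."""
    i, j = np.indices(S.shape)
    return np.where(np.abs(i - j) <= k, S, 0.0)

def taper_covariance(S, k):
    """Tapering (illustrative linear weights): smoothly down-weight entries
    with distance from the diagonal instead of truncating them sharply."""
    i, j = np.indices(S.shape)
    weights = np.clip(1.0 - np.abs(i - j) / (k + 1.0), 0.0, 1.0)
    return S * weights

# Example: n = 100 observations of p = 50 variables
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
S = np.cov(X, rowvar=False)          # p x p sample covariance
S_banded = band_covariance(S, k=5)
S_tapered = taper_covariance(S, k=5)
```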
Similar references
Regularized Estimation of High-dimensional Covariance Matrices
Condition Number Regularized Covariance Estimation, by Joong-Ho Won
Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called "large p, small n" setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to ...
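One simple way to see what a condition-number constraint does, sketched below under the assumption that clipping the sample eigenvalues is acceptable, is to truncate the spectrum of the sample covariance so that the ratio of its largest to smallest eigenvalue does not exceed a chosen bound. The cutoff rule here is a simplified illustration only, not the constrained maximum-likelihood estimator studied in the cited paper.

```python
import numpy as np

def clip_condition_number(S, kappa_max=100.0):
    """Return a better-conditioned version of the symmetric PSD matrix S by raising
    small eigenvalues so the condition number is at most kappa_max (illustrative rule)."""
    eigvals, eigvecs = np.linalg.eigh(S)
    lower = eigvals.max() / kappa_max      # smallest eigenvalue allowed
    clipped = np.clip(eigvals, lower, None)
    return (eigvecs * clipped) @ eigvecs.T

# Usage on a p x p sample covariance matrix S:
# S_wc = clip_condition_number(S, kappa_max=50.0)
```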
Analysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values
Estimating the mean and the covariance matrix of an incomplete dataset and filling in missing values with imputed values is generally a nonlinear problem, which must be solved iteratively. The expectation maximization (EM) algorithm for Gaussian data, an iterative method both for the estimation of mean values and covariance matrices from incomplete datasets and for the imputation of missing val...
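The EM recursion described in this entry alternates between filling in missing entries with their conditional expectations under the current Gaussian fit and re-estimating the mean and covariance from the completed data. The minimal sketch below assumes NaN-coded missing values and implements only the plain maximum-likelihood iteration, without the ridge-type regularization discussed in the cited work.

```python
import numpy as np

def em_gaussian(X, n_iter=50):
    """EM estimation of the mean and covariance of a Gaussian from data with
    NaN-coded missing values (plain ML version; a minimal sketch)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        X_filled = np.empty_like(X)
        correction = np.zeros((p, p))      # accumulated conditional covariances
        for i in range(n):
            miss = np.isnan(X[i])
            obs = ~miss
            x = X[i].copy()
            if miss.any():
                S_oo = sigma[np.ix_(obs, obs)]
                S_mo = sigma[np.ix_(miss, obs)]
                S_mm = sigma[np.ix_(miss, miss)]
                coef = np.linalg.solve(S_oo, S_mo.T).T      # S_mo @ inv(S_oo)
                x[miss] = mu[miss] + coef @ (X[i, obs] - mu[obs])
                correction[np.ix_(miss, miss)] += S_mm - coef @ S_mo.T
            X_filled[i] = x
        mu = X_filled.mean(axis=0)
        centered = X_filled - mu
        sigma = (centered.T @ centered + correction) / n
    return mu, sigma, X_filled
```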
Penalized model-based clustering with unconstrained covariance matrices.
Clustering is one of the most useful tools for high-dimensional analysis, e.g., for microarray data. It becomes challenging in the presence of a large number of noise variables, which may mask underlying clustering structures. Therefore, noise removal through variable selection is necessary. One effective way is regularization for simultaneous parameter estimation and variable selection in model-ba...
A CLT for Regularized Sample Covariance Matrices
We consider the spectral properties of a class of regularized estimators of (large) empirical covariance matrices corresponding to stationary (but not necessarily Gaussian) sequences, obtained by banding. We prove a law of large numbers (similar to that proved in the Gaussian case by Bickel and Levina), which implies that the spectrum of a banded empirical covariance matrix is an efficient esti...
Journal:
Volume, issue:
Pages: -
Publication year: 2006